Synopsis

This file shows the rankings of 25 Latin American countries, derived from the answers to 86 questions. The main objective is to process the original data and display it on an interactive map using Leaflet.

Data Processing

The first step is to load the required libraries, then load and clean the original data.

## Libraries
library(rgdal)
library(leaflet)
library(tidyr)
library(dplyr)
library(xlsx)

The data comes as an .xlsx file. Inspecting it showed that it needed some manipulation before use. The script below does the following:

  1. Loads only the numerical data, i.e., without the headers
  2. Loads the first line of the file to get the countries
  3. Adds the country names to the respective columns
  4. Converts each column to the appropriate class
  5. Removes the unnecessary NAs

## Loading the data
data <- read.xlsx2("BD-RankingPoderJudicial-AL.xlsx", header = FALSE, 
                  startRow = 3, endRow = 88, sheetIndex = 1, stringsAsFactors = FALSE)

data.countries <- read.xlsx2("BD-RankingPoderJudicial-AL.xlsx", header = FALSE, 
                   startRow = 1, endRow = 1, sheetIndex = 1, stringsAsFactors = FALSE)


## Adding the country names, and removing unnecessary columns
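## data.countries is a one-row data frame, so flatten it to a character vector
colnames(data)[4:28] <- unlist(data.countries[1, 4:28], use.names = FALSE)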
colnames(data)[1] <- "Type"
## Let's remove the questions and the ID
data <- data[,c(1,4:28)]

## Converting values to numeric type
data[, 2:26] <- sapply(data[, 2:26], as.numeric)
## as.factor() is vectorised, so it can be applied to the whole column at once
data[, 1] <- as.factor(data[, 1])

## Let's remove NAs
data <- data %>% filter(!is.na(data[,2]))
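The conversion and NA-removal steps can be tried in isolation on a small made-up table. This is only a sketch in base R: the `toy` data frame, its country columns, and the blank last row are assumptions for illustration, not the contents of the real spreadsheet.

```r
# Hypothetical stand-in for the spreadsheet: every column arrives as
# character, and the last row is blank.
toy <- data.frame(
  Type  = c("A", "B", ""),
  Chile = c("1", "0", ""),
  Peru  = c("0", "1", ""),
  stringsAsFactors = FALSE
)

# Convert the country columns to numeric; empty strings become NA.
toy[, 2:3] <- sapply(toy[, 2:3], as.numeric)

# Keep only rows whose first country column is not NA.
toy <- toy[!is.na(toy[, 2]), ]

# Convert Type to a factor after filtering, so the blank row
# does not leave an empty level behind.
toy$Type <- factor(toy$Type)

str(toy)
```

Converting `Type` after the filter is a choice made for this sketch; the script above converts first and filters afterwards.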

Now that we have a clean data set, we need to melt the data to make it tidy. This step will also allow us to summarize the data as needed.

## Let's do a summary of the data
# First, we gather all the data to make sure we can add the values
data.melted <- data %>% gather(key = Country, value = Value, -Type)

# Next, we add all the values together to get a score.
data.summary <- data.melted %>% group_by(Type, Country) %>% 
                            summarise(Total = sum(Value))

The end product is a data frame that contains the total sum (ranking) of each country by question type. With the summary in hand, it is possible for us to filter the information and display it on a map.
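To make the reshaping and summing concrete, here is a minimal sketch on a made-up three-question table. The `toy` data, its two countries, and the type labels are assumptions for illustration; only the `gather`, `group_by` and `summarise` calls mirror the script above.

```r
library(tidyr)
library(dplyr)

# Hypothetical wide table: one row per question, one column per country.
toy <- data.frame(
  Type  = c("A", "A", "B"),
  Chile = c(1, 0, 1),
  Peru  = c(1, 1, 0),
  stringsAsFactors = FALSE
)

# Wide to long: one row per (question, country) pair.
toy.melted <- toy %>% gather(key = Country, value = Value, -Type)

# Sum the values for each (Type, Country) combination.
# .groups = "drop" returns an ungrouped result.
toy.summary <- toy.melted %>%
  group_by(Type, Country) %>%
  summarise(Total = sum(Value), .groups = "drop")

print(toy.summary)
```

The result has one row for each combination of type and country, which is the shape the map code expects after filtering to a single type.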

# Let's have a variable holding the available types of data
type <- unique(data.summary$Type)

# Let's filter the data based on the information type. Change this value from
# 1 to 6 based on the type you want to display
data.display <- data.summary %>% filter(Type == type[2])

In the above code, you can change the index into type according to the ranking you prefer to display on the map. Printing type lists the available options.

Map Visualization

Generating the map is a simple task. The first step was to obtain the layers of all countries, either by searching for them online or by using GIS software. In my case, I found a world map online and used QGIS to select the countries I needed.

The second step is to create a color palette, which generates a gradient based on our data. For this task I used the colorNumeric function with the Greens palette.
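The palette step can be tried on its own before building the map. The sketch below assumes a made-up domain of 0 to 10 rather than the real totals; `colorNumeric` returns a function that maps each number in the domain to a hex color.

```r
library(leaflet)

# Hypothetical domain; the real script uses data.display$Total.
toy.pal <- colorNumeric(palette = "Greens", domain = c(0, 10))

# The palette is a function: numbers in, hex color strings out.
# Lower values map to lighter greens, higher values to darker ones.
toy.colors <- toy.pal(c(0, 5, 10))
print(toy.colors)
```

Values outside the domain are mapped to the `na.color` argument of `colorNumeric`, so it helps to build the domain from the same column that is later passed to the palette.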

# Reading the map - obtained via GIS
map <- readOGR("map", layer = "latinAmerica", verbose = FALSE)
map@data <- merge(map@data, data.display, by.x = "NAME", by.y ="Country",  sort = FALSE)

# Let's generate some colors
pal <- colorNumeric(palette = "Greens", domain = data.display$Total)

stamen_tiles <- "http://{s}.tile.stamen.com/toner-lite/{z}/{x}/{y}.png"

## Setting up the pop-up
map.popup <- paste0("<strong>Country: </strong>",
                       map@data$NAME,
                       "<br><strong>Total: </strong>",
                       map[['Total']] )

latinAmerica.map <- leaflet(data = map) %>% addTiles() %>% 
    addTiles(urlTemplate = stamen_tiles) %>%
    setView(lat = 12.168, lng = -86.704, zoom = 4) %>%
    addPolygons(fillColor =~pal(map[['Total']]), 
                fillOpacity = 0.7, 
                color = "#BDBDC3", 
                weight = 1, 
                popup = map.popup) %>%
    addLegend("bottomright", 
              pal = pal, 
              values =~Total,
              title = paste("Ranking en",map@data$Type[1]),
              opacity = 1)

## Printing the map
latinAmerica.map 

The final map is overlaid on an interactive base map (similar to Google Maps). The base map comes from the tile layers added in the code above: Leaflet's default OpenStreetMap tiles and the Stamen toner-lite tiles.
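One point worth checking in the map code is the merge() call on map@data. The R documentation says that with sort = FALSE the rows are returned in an unspecified order, so the attribute rows may no longer line up with the polygons. A possible alternative, sketched here in base R with made-up data frames (`shapes` and `totals` are hypothetical stand-ins for map@data and data.display), is to look up the values with match(), which keeps the original row order.

```r
# Hypothetical stand-in for map@data: polygon attributes in polygon order.
shapes <- data.frame(
  NAME = c("Peru", "Chile", "Mexico"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for data.display: totals in a different order.
totals <- data.frame(
  Country = c("Chile", "Mexico", "Peru"),
  Total   = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# For each polygon, find the row of its country in the totals table.
idx <- match(shapes$NAME, totals$Country)

# Attach the totals without changing the row order of shapes.
shapes$Total <- totals$Total[idx]

print(shapes)
```

Countries missing from the totals table would get NA here, which makes mismatched names easy to spot before plotting.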

Acknowledgments

The data was collected and provided by Rodrigo Sandoval from the University of Toluca in Mexico. Many thanks to him for making the data open.